(54) Voice-based manipulation method and apparatus



(57) Disclosed are a voice-based manipulation 
apparatus and a voice-based manipulation method. The 
voice-based manipulation apparatus comprises a stor- 
age section for storing voice information for specifying 
manipulation targets in association with the manipula- 
tion targets; a manipulation section for, when a voice is 
supplied, manipulating that of the manipulation targets 
which is associated with that of the voice information 
stored in the storage section which corresponds to the 
voice; and a search section for searching the voice infor- 
mation stored in the storage section in association with 
the manipulation target and presenting resultant voice
information. The voice-based manipulation method
comprises the steps of storing voice information for 
specifying manipulation targets in a storage section in 
association with the manipulation targets; manipulating, 
when a voice is supplied, that of the manipulation tar- 
gets which is associated with that of the voice informa- 
tion stored in the storage section which corresponds to 
the voice; and searching the voice information stored in 
the storage section in association with the manipulation 
target and presenting resultant voice information. 



[FIG. 2: block diagram of the voice control unit, showing the microphone, microphone amplifier, voice recognizer, imitation sound generator, controller, I/F circuit and the connection to the audio unit]









EP 1 065 652 A1 






Description 

BACKGROUND OF THE INVENTION 
FIELD OF THE INVENTION 

[0001] The present invention relates to a voice- 
based manipulation technique capable of controlling 
and manipulating electronic devices or the like through 
input voices, and, more particularly, to a voice-based 
manipulation method and apparatus, which allow even 
a user who does not remember registered words to eas- 
ily check the correlation between registered words and 
subjects to be manipulated, thereby improving the oper- 
ability. 

DESCRIPTION OF THE RELATED ART 

[0002] Voice manipulation techniques which permit
a user to manipulate electronic devices or the like
through input voices have been proposed. Meanwhile,
improvements in voice recognition techniques
have also been made. With such improved voice recognition
techniques, electronic devices or the like which use
voice-based manipulation techniques are being actively developed.

[0003] For example, there is an on-board audio system
for a vehicle, which can be manipulated by voice in the
following manner. Using this audio system, a user
registers voice data for each of the channel frequencies 
of broadcasting stations. When the user utters some 
words corresponding to one of the registered voice 
data, the audio system recognizes the uttered words 
through a voice recognition technique and automatically 
tunes to the designated channel frequency. 
[0004] More specifically, the user tunes to the channel
frequency of a desired broadcasting station and
utters words, for example, "first broadcasting station".
By manipulating a voice registration button provided on
the on-board audio system, the user can store (register)
voice data of the words "first broadcasting station" in a
memory in association with that channel frequency. In a
similar fashion, the user tunes to the channel frequen- 
cies of other broadcasting stations and utters words, 
such as "second broadcasting station" and "third broad- 
casting station". As a result, voice data of the words 
"second broadcasting station", "third broadcasting sta- 
tion" and so forth can be stored in the memory in asso- 
ciation with the tuned channel frequencies. When, after 
this voice registering operation, the user utters one 
stream of words, selected from the registered groups of 
words, such as "first broadcasting station", "second 
broadcasting station" and "third broadcasting station", 
the audio system recognizes the voiced words and 
automatically tunes to the designated channel fre- 
quency. 

[0005] As mentioned above, this on-board audio
system can permit voice-based manipulation based on
voice data that has been registered beforehand in association
with subjects to be manipulated (hereinafter
referred to as "manipulation targets"). However, users are
likely to forget registered words or forget the correlation
between the registered words and manipulation targets.
In this case, each user may have to, for example, repeat
the above-described voice registering operation to
replace old voice data stored in the memory with new
voice data.

[0006] It is desirable to allow voice registration of
any words, not only specific words, thereby improving the
operability for users. If such a highly general-purpose
design is adopted, however, the resulting audio system, though
effective in many ways, would suffer from lower operability,
because users are apt to forget registered words.

[0007] While the tuning operation of an on-board
audio system has been specifically discussed to show
the problem of the conventional voice-based manipulation
techniques, the same problem arises in the case
where a user who is likely to forget registered words
loads a recording/reproducing medium in an MD (Mini
Disc) player, CD (Compact Disc) player or the like,
which is installed in an on-board audio system, and
selects a musical piece, a title or the like, recorded on
that medium, by voice.

[0008] The fact that users may forget registered
words is a problem that should be overcome not only
for on-board audio systems for vehicles but also for
voice-based manipulation techniques in general.

SUMMARY OF THE INVENTION 

[0009] Accordingly, it is an object of the present
invention to provide a voice-based manipulation method
and apparatus, which allow even a user who has forgotten
registered words to easily check the correlation
between registered words and manipulation targets,
thereby ensuring an improved operability.
[0010] To achieve the above object, according to
one aspect of this invention, there is provided a voice-based
manipulation apparatus which comprises a storage
section for storing voice information for specifying
manipulation targets in association with the manipulation
targets; a manipulation section for, when a voice is
supplied, manipulating that of the manipulation targets
which is associated with that of the voice information
stored in the storage section which corresponds to the
voice; and a search section for searching the voice information
stored in the storage section in association with
the manipulation target and presenting resultant voice
information.

[0011] According to another aspect of this invention,
there is provided a voice-based manipulation
method which comprises the steps of storing voice
information for specifying manipulation targets in a storage
section in association with the manipulation targets;
manipulating, when a voice is supplied, that of the
manipulation targets which is associated with that of the
voice information stored in the storage section which
corresponds to the voice; and searching the voice information
stored in the storage section in association with
the manipulation target and presenting resultant voice
information.
[0012] With the above structures, a user can
acquire voice information which is searched and presented
by the search section (or the searching step).
Even if the user forgets, or is uncertain about, voice
information stored (registered) in the storage section,
the user can easily check the correlation between the
voice information and the manipulation target which is
associated with that voice information. Even when the
user does not remember voice information, therefore, it
is unnecessary to store voice information again in the
storage section, resulting in an improved operability.
[0013] It is preferable in the above voice-based
manipulation apparatus and method that, in response to a
search instruction externally supplied, the search section
or the searching step should detect an active
manipulation target, search for that voice information
which is associated with the detected active manipulation
target and present the searched voice information.
[0014] In this case, when voice information associated
with the active manipulation target is not stored in
the storage section, the search section or the searching
step may search other voice information stored in the
storage section in association with the manipulation target
and present the searched voice information.
[0015] In the above two preferable modes, it is further
preferable that in response to the search instruction
externally supplied, the search section or the searching
step should search the voice information stored in the
storage section in a predetermined order in association
with the manipulation target and present the searched
voice information.

[0016] In this case, the predetermined order may be
an alphabetical order, a forward sort direction or a
reverse sort direction.
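
The search behaviour described in paragraphs [0013] to [0016] can be sketched roughly as follows. This is only an illustrative Python sketch, not the implementation of the disclosed apparatus: the `Entry` class, the `search_voice_info` function and the choice of alphabetical order for the fallback are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """One registered pair (hypothetical structure for illustration)."""
    target: str       # manipulation target, e.g. a track or a channel frequency
    voice_label: str  # the registered words associated with that target


def search_voice_info(entries: List[Entry], active_target: Optional[str]) -> List[str]:
    """Return the registered words to present to the user.

    Words registered for the active target are preferred. If none exist,
    the other registered words are presented in a predetermined order
    (alphabetical order is assumed here).
    """
    hits = [e.voice_label for e in entries if e.target == active_target]
    if hits:
        return hits
    # Fallback: nothing is registered for the active target.
    return sorted(e.voice_label for e in entries)
```

With entries for "disc1 track1" and "76.1 MHz", a search while the tuner is on 76.1 MHz would present only the words registered for that frequency, while a search on an unregistered target would list every registered word in order.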

[0017] In the voice-based manipulation apparatus
according to the first aspect, the voice-based manipulation
method according to the second aspect, or any one
of the above-described preferable modes, the storage
section can store the voice information again and may
store a supplied voice as voice information associated
with an active manipulation target at the time of storing
the voice information again.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0018] Other aspects and advantages of the invention
will become readily apparent from the following
description, taken in conjunction with the accompanying
drawings, illustrating by way of example the principles of
the invention.

FIG. 1 is a plan view showing the outer appearance
of a voice-based manipulation apparatus according
to one embodiment of this invention;

FIG. 2 is a block diagram illustrating the structure of
a signal processor incorporated in a voice control
unit;

FIGS. 3A through 3C are diagrams respectively
showing individual memory maps of a title designation
voice data memory table, a unit designation
voice data memory table and an adjusted voice
data memory table;

FIGS. 4A and 4B are explanatory diagrams illustrating
the functions of a normal registration/voice
operation key;

FIGS. 5A and 5B are explanatory diagrams illustrating
the functions of a unit registration/search key;

FIGS. 6A and 6B are explanatory diagrams illustrating
the functions of an adjusted voice registration/search
key;

FIGS. 7A and 7B are explanatory diagrams illustrating
the functions of a volume control/guidance language
switching key;

FIGS. 8A and 8B are explanatory diagrams illustrating
the functions of a search/forward scan key;

FIGS. 9A and 9B are explanatory diagrams illustrating
the functions of a search/reverse scan key;

FIG. 10 is a flowchart illustrating the operation of
the voice-based manipulation apparatus according
to this embodiment in standby mode;

FIG. 11 is a flowchart showing the operation of the
apparatus in voice registration mode;

FIG. 12 is a flowchart showing the operation of the
apparatus in unit designation voice registration
mode;

FIG. 13 is a flowchart showing the operation of the
apparatus in equalizer adjusted voice registration
mode;

FIG. 14 is a flowchart illustrating the operation of
the apparatus in voice-based manipulation mode;
and

FIGS. 15A and 15B are flowcharts illustrating the
operation of the apparatus in registered voice data
search mode.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

[0019] With reference to the accompanying draw- 
ings, a description will now be given of a preferred 
embodiment of the present invention as adapted to a 
voice-based manipulation apparatus that allows a user 
to perform the voice-based manipulation of an on-board 
audio system for a vehicle which is equipped with a 
reception tuner for receiving radio broadcast waves or 
the like, an MD player for playing an MD, a CD player for 
playing a CD, an equalizer for adjusting a frequency 
characteristic, an amplifier for controlling the volume 
and so forth. (Those components of the on-board audio 
system will hereinafter be called "audio units".) 
[0020] FIG. 1 shows the outer appearance of a
voice-based manipulation apparatus 1, and FIG. 2 illustrates
the structure of a signal processor which is incorporated
in a voice control unit 2.
[0021] Referring to FIG. 1, the voice-based manipulation
apparatus 1 comprises the voice control unit 2,
which is the main unit to control the aforementioned
individual audio units, a microphone 3 through which a
user inputs a voice to give an instruction to the voice
control unit 2, and a remote operation section 4.
[0022] The remote operation section 4 has a small
speaker 5 and push-button type operational button
switches 6 to 11. The operational button switch 6 is
called a "normal registration/voice operation key", the
operational button switch 7 a "search/forward scan key",
the operational button switch 8 a "search/reverse scan
key", the operational button switch 9 a "unit registration/search
key", the operational button switch 10 an
"adjusted voice registration/search key", and the operational
button switch 11 a "volume control/guidance language
switching key". Those keys have predetermined
functions which will be discussed later.
[0023] As shown in FIG. 2, the microphone 3 and
the remote operation section 4 are connected to a connector
14 of the voice control unit 2 via connection
cables 12 and 13, respectively.

[0024] Referring to FIG. 2, the voice control unit 2
includes an amplifier (microphone amplifier) 15, a voice
recognizer 18 and a voice data memory 19. As the user
utters words, a voice signal is supplied from the microphone
3 to the microphone amplifier 15 via the connection
cable 12. The microphone amplifier 15 amplifies the
voice signal and sends it to the voice recognizer 18. The
voice recognizer 18 performs voice recognition on the
received voice signal. The voice data memory 19, which
is a non-volatile memory, stores voice data recognized
by the voice recognizer 18.

[0025] The voice data memory 19 has a title designation
voice data memory table 19a, a unit designation
voice data memory table 19b, an adjusted voice data
memory table 19c, and a guidance data memory table
19d. The first three tables 19a to 19c store the voice
data supplied from the voice recognizer 18. The last
table 19d prestores voice guidance data for generating
voice guidance messages which will be discussed later.
[0026] As exemplarily shown in FIG. 3A, the title
designation voice data memory table 19a is provided to
store (register) information, such as a musical piece
which is being played by an active or currently operating
audio unit, its title and the channel frequency of a broadcasting
station, in association with data of voices
uttered by the user (voice data). The unit designation
voice data memory table 19b, as exemplarily shown in
FIG. 3B, serves to store (register) the name of an audio
unit in operation in association with data of voices
uttered by the user (voice data). As exemplarily shown
in FIG. 3C, the adjusted voice data memory table 19c
serves to store (register) information on the setting state
of the equalizer and the set positioning in association
with data of voices uttered by the user (voice data).
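
As a rough illustration of how the three registration tables 19a to 19c relate voice data to manipulation targets, the following Python sketch models each table as a dictionary keyed by the recognized words. The class name `VoiceDataMemory`, its methods and the use of plain strings for voice data are assumptions made for the example; the actual memory maps are those of FIGS. 3A through 3C.

```python
from typing import Dict, Optional, Tuple


class VoiceDataMemory:
    """Hypothetical in-memory stand-in for the voice data memory 19."""

    def __init__(self) -> None:
        # One dictionary per table: 19a (titles), 19b (unit names),
        # 19c (equalizer settings). Keys are voice data, values are targets.
        self.tables: Dict[str, Dict[str, str]] = {"19a": {}, "19b": {}, "19c": {}}

    def register(self, table_id: str, voice_data: str, target: str) -> None:
        """Store voice data in association with a manipulation target."""
        self.tables[table_id][voice_data] = target

    def lookup(self, voice_data: str) -> Optional[Tuple[str, str]]:
        """Return (table id, target) for the voice data, or None if unregistered."""
        for table_id, table in self.tables.items():
            if voice_data in table:
                return table_id, table[voice_data]
        return None
```

For instance, registering "one" against a CD track in table 19a and "si:di:" against the CD player in table 19b lets a later lookup of either utterance recover both the table and the associated target.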
[0027] The voice control unit 2 further includes an
amplifier (speaker amplifier) 16, an imitation sound generator
17, a voice synthesizer 20, a controller 21, an
interface (I/F) circuit 22 and an interface port 23.

[0028] The imitation sound generator 17 generates
an imitation sound signal, such as "Peep" or "Beep".
The voice synthesizer 20 generates a guidance voice
signal based on the voice data or the voice guidance
data stored in the voice data memory 19. The speaker
amplifier 16 amplifies the guidance voice signal and the
imitation sound signal and sends the amplified signals
via the connection cable 13 to the speaker 5 in the
remote operation section 4.

[0029] The controller 21 receives operation signals
from the individual operational button switches 6-11 via
the connection cable 13 and controls the individual
audio units. The I/F circuit 22 and the interface port 23
permit bidirectional communications between the controller
21 and each audio unit.

[0030] The controller 21 is provided with a microprocessor
which runs a preset system program to control
the general operation of the voice-based
manipulation apparatus 1 and the operations of the individual
audio units.

[0031] The operation of the voice-based manipulation
apparatus 1 with the above-described structure will
be discussed below referring to FIGS. 3A to 15. FIGS.
3A through 3C respectively show the individual memory
maps of the title designation voice data memory table
19a, the unit designation voice data memory table 19b
and the adjusted voice data memory table 19c. FIGS.
4A through 9B are explanatory diagrams illustrating the
functions of the operational button switches 6-11. FIGS.
10 through 15 are flowcharts for explaining operational
examples of the voice-based manipulation apparatus 1
when the user operates the operational button switches
6-11.

[0032] As illustrated in FIGS. 4A through 9B, when
the user depresses one of the operational button
switches 6-11 for a short time or for 2 or more seconds,
the mode that matches the user's operation is set.
[0033] According to this embodiment, the modes
are classified into three kinds of modes: a registration
mode for previously registering voice data necessary for
voice-based manipulations in the title designation voice
data memory table 19a, the unit designation voice data
memory table 19b and the adjusted voice data memory
table 19c, an operation mode for ensuring voice-based
manipulations as the user utters voices corresponding
to the voice data that are registered in those voice data
memory tables 19a-19c, and a search mode for allowing
the user to check the voice data registered in those
voice data memory tables 19a-19c.

[0034] In FIG. 10, as the main power source of an
on-board audio system is switched on, the voice-based
manipulation apparatus 1 is automatically powered on
and the controller 21 stands by until one of the operational
button switches 6-11 is operated (steps 100 to
120). When the user manipulates one of the operational
button switches 6-11 for a short time or for 2 or more
seconds during this standby process, the mode that corresponds
to the user's operation is set, as shown in
FIGS. 4A through 9B.
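
The standby dispatch can be pictured with a small Python sketch. It covers only the two long-press cases that this description spells out (key 6 in step 102 and key 9 in step 104); the short-press functions and the remaining keys are defined in FIGS. 4A through 9B and are deliberately left out. The names `LONG_PRESS_MODES` and `select_mode`, and the use of `None` for cases not covered, are assumptions made for the example.

```python
from typing import Optional

LONG_PRESS_SECONDS = 2.0  # "2 or more seconds" in the description

# Long-press mappings stated in the description for steps 102 and 104.
# Other keys and all short-press functions are intentionally omitted.
LONG_PRESS_MODES = {
    6: "voice registration mode",
    9: "unit designation voice registration mode",
}


def select_mode(key: int, held_seconds: float) -> Optional[str]:
    """Return the mode for a long press, or None if this sketch does not cover it."""
    if held_seconds >= LONG_PRESS_SECONDS:
        return LONG_PRESS_MODES.get(key)
    return None
```

Holding key 6 for two seconds or more selects the voice registration mode, whereas a brief press of the same key falls outside what this sketch models.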

[0035] When it is determined in step 102 that the
normal registration/voice operation key 6 has been continuously
depressed for 2 or more seconds, the mode is
set to the voice registration mode and the operation
goes to a routine shown in FIG. 11. In voice registration
mode, first, the controller 21 sets "1" in a program counter
constructed by the system program and carries out a
sequence of processes starting at step 200.
[0036] In step 200, the voice synthesizer 20 reads
predetermined voice guidance data from the guidance
data memory table 19d and generates a guidance voice
signal, and the imitation sound generator 17 generates
an imitation sound signal of "Peep".
[0037] The controller 21 supplies the guidance
voice signal and the imitation sound signal to the speaker
amplifier 16 and reproduces "Register title .. Peep", a
guidance sound comprised of a guidance voice and an
imitation sound, from the speaker 5, requesting the user to
utter a voice to be registered.

[0038] In the next step 202, the voice recognizer 18
initiates a voice recognition process. When the user
utters desired words in response to the guidance sound,
the voice recognizer 18 detects the beginning of this
voice generation, at which point a program timer in the
controller 21 is activated so that the voice recognizer 18
is controlled to execute voice recognition of the uttered
voice within 2.5 seconds.

[0039] More specifically, before giving the guidance
sound, the voice recognizer 18 measures sounds
(power of ambient sounds) which are picked up by the
microphone 3 and are input via the microphone amplifier
15, and sets the power level of the ambient sounds
as a noise level. The voice recognizer 18 adds up the output
signal of the microphone amplifier 15 every 10 milliseconds,
takes each added value as a sound power level and sets
a first threshold value THD1, higher than the power level
of the ambient sounds, every 10 milliseconds.
[0040] When the user utters a voice, the voice rec- 
ognizer 18 compares the level of the uttered voice 
(voice power) with the latest first threshold value THD1 
and determines the point when the level of the uttered 
voice becomes greater than the first threshold value 
THD1 as the beginning of voice generation. The pro- 
gram timer is activated at the beginning of voice gener- 
ation, and the voice recognizer 18 recognizes the 
uttered voice within 2.5 seconds and generates voice 
data corresponding to the recognition result. 
[0041] At this point, the voice recognizer 18 further
compares the level of the uttered voice (voice power)
with a second threshold value THD2 (fixed value) which
is preset higher than the first threshold value THD1, and
determines that voice recognition has been carried out
properly when the voice power becomes greater than
the second threshold value THD2. That is, when the
level of the uttered voice becomes higher than the latest
first threshold value THD1 and then becomes higher
than the second threshold value THD2, the uttered
voice is taken as the subject to be recognized. This
allows the property of the uttered voice which is less
influenced by noise to be extracted accurately, thus
improving the precision of voice recognition.
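
The two-threshold check described in paragraphs [0039] to [0041] can be sketched as below. This is a simplified Python illustration, not the recognizer's actual algorithm: it assumes the power level has already been summed into one value per 10-millisecond frame, and it derives THD1 once from the measured noise level plus a fixed `margin`, whereas the description updates THD1 every 10 milliseconds. The function name and parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def detect_utterance(frame_powers: List[float], noise_level: float,
                     margin: float, thd2: float) -> Tuple[Optional[int], bool]:
    """Return (onset frame index, accepted flag) for a sequence of frame powers.

    The onset is the first frame whose power exceeds THD1. The utterance is
    accepted as a recognition target only if, at or after the onset, the
    power also exceeds the fixed, higher threshold THD2.
    """
    thd1 = noise_level + margin  # assumed rule: THD1 sits above the noise level
    onset: Optional[int] = None
    accepted = False
    for i, power in enumerate(frame_powers):
        if onset is None and power > thd1:
            onset = i  # beginning of voice generation; the timer would start here
        if onset is not None and power > thd2:
            accepted = True
            break
    return onset, accepted
```

A sequence that rises above THD1 but never reaches THD2 yields an onset with `accepted` left false, which corresponds to the case where recognition is judged not to have been performed properly.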

[0042] In the next step 204, it is determined from
the action of the timer or a variation in level whether or
not voice recognition has been completed. Then, it is
determined if voice recognition has been performed
properly in step 206. This decision is made by checking
if the level of the uttered voice (voice power) input as a
recognition target has been higher than the first and
second threshold values THD1 and THD2. When it is
determined that voice recognition has been done properly,
the flow goes to step 208.

[0043] In step 208, the controller 21 receives information
of an audio unit in operation and information
which is currently reproduced by that audio unit via the
I/F circuit 22 and the interface port 23, and stores the
received data and the voice data generated by the voice
recognizer 18 in the title designation voice data memory
table 19a in association with each other (in combination).

[0044] Suppose that the audio unit in operation is a
CD player which is currently playing a musical piece or
the like on track 1 of a recording/reproducing medium
(CD). If the user utters a word "one" in step 202, the
received data becomes "disc1 track1" and the voice
data has word information of "one". The received data
and voice data are stored (registered) as registered
voice data in the title designation voice data memory
table 19a in association with each other.
[0045] As another example, suppose that the audio
unit in operation is a radio tuner which is currently tuned
to a broadcasting station having a channel frequency of
76.1 MHz. If the user utters a word "seven" in step 202,
the received data about the channel frequency of 76.1
MHz and the voice data "seven" are stored (registered)
as registered voice data in the title designation voice
data memory table 19a in association with each other.

[0046] In other words, in voice registration mode,
voice data corresponding to the voice uttered by the
user is registered in the title designation voice data
memory table 19a in association with information, such
as the musical piece that is currently played by an audio
unit in operation and the title of the musical piece or the
received channel frequency, as shown in FIG. 3A.
[0047] When the registration of voice data is completed,
the flow advances to step 210 where the voice
synthesizer 20 reads predetermined voice guidance
data from the guidance data memory table 19d and
generates a guidance voice signal. The controller 21
supplies the guidance voice signal to the speaker amplifier
16 and reproduces a guidance sound, "Registered",
from the speaker 5, informing the user of the end of the
registration. After the voice registration mode is completed,
the operation goes again to the standby mode
and starts the routine in FIG. 10 again at step 100.
[0048] When it is determined in step 206 that voice
recognition has not been done properly, the flow moves
to step 212. In step 212, the controller 21 checks the
value of the program counter to determine whether this
is the second attempt. If it is the second attempt, the flow goes
to step 214.

[0049] In step 214, the imitation sound generator 17
generates an imitation sound signal of "Beep Beep".
The controller 21 sends this imitation sound signal of
"Beep Beep" to the speaker amplifier 16 and then outputs
a guidance sound of "Beep Beep" from the speaker
5, notifying registration failure. When the voice registration
mode is ended, the operation comes again to the
standby mode and starts the routine in FIG. 10 again at
step 100. In other words, if the property of the uttered
voice cannot be extracted accurately due to the influence
of noise or the like, the user should perform the
registering operation from the start.
[0050] When it is determined in step 212 that the
value of the program counter is "1", the flow goes to step
216. In step 216, the count value of the program counter
is checked to determine whether or not the voice registration
has taken 2.5 seconds or longer.
[0051] When the voice registration has taken 2.5
seconds or longer, the voice synthesizer 20 reads predetermined
voice guidance data from the guidance data
memory table 19d and generates a guidance voice signal,
and the imitation sound generator 17 generates an
imitation sound signal of "Peep". The controller 21 supplies
the guidance voice signal and the imitation sound
signal to the speaker amplifier 16 and reproduces "Beep
.. Too long" from the speaker 5, warning the user that
the time for the voice registration is too long.
[0052] If the voice registration mode has not been
carried out properly due to some other factors, the voice
synthesizer 20 reads predetermined voice guidance
data from the guidance data memory table 19d and
generates a guidance voice signal, and the imitation
sound generator 17 generates an imitation sound signal
of "Peep". Then, the controller 21 supplies the guidance
voice signal and the imitation sound signal to the
speaker amplifier 16 and reproduces "Beep .. Try again"
from the speaker 5, requesting the user to make voice
input again.

[0053] When this notification is completed, "2" is set
in the program counter and the operation restarts at
step 200 to allow the user to utter desired words again.
In other words, step 216 mainly gives a warning to
the effect that the way the user uttered a voice was not
adequate. When the user properly utters the intended
words again in response to this warning, the voice data
is registered in step 208. Therefore, the user can register
adequate voice data without manipulating the normal
registration/voice operation key 6 again, which
demonstrates an improved operability.
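
The retry logic of steps 200 to 216 can be summarized in a short Python sketch. It is an interpretation of the flow described above, not the system program itself: `recognize` is a hypothetical callable standing in for steps 202 to 206 that returns the recognized voice data (or `None` on failure) together with the utterance duration in seconds, and the function returns the guidance sounds that would be reproduced.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_SECONDS = 2.5  # recognition window stated in the description


def register_with_retry(recognize: Callable[[], Tuple[Optional[str], float]],
                        table: Dict[str, str], target: str) -> List[str]:
    """Run the registration flow with at most two attempts; return the prompts played."""
    prompts: List[str] = []
    for attempt in (1, 2):  # values taken by the program counter
        prompts.append("Register title .. Peep")            # step 200
        voice_data, seconds = recognize()                    # steps 202 to 206
        if voice_data is not None and seconds < MAX_SECONDS:
            table[voice_data] = target                       # step 208
            prompts.append("Registered")                     # step 210
            break
        if attempt == 2:
            prompts.append("Beep Beep")                      # step 214: give up
            break
        # Step 216: tell the user what went wrong, then retry once.
        prompts.append("Beep .. Too long" if seconds >= MAX_SECONDS
                       else "Beep .. Try again")
    return prompts
```

A first attempt that runs too long followed by a proper utterance therefore produces the "Too long" warning, a second request, and then "Registered", without the user pressing the key again.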
[0054] Once the user continuously depresses the
normal registration/voice operation key 6 for 2 or more
seconds, merely uttering words according to a guidance
sound can cause the words uttered by the user to be
registered in the title designation voice data memory
table 19a in association with information, such as the
musical piece that is currently played by an audio unit in
operation and the title of the musical piece or the channel
frequency of a broadcasting station. That is, it is
possible to make voice registration of the information
itself that the user wants, not the name of an audio unit.
After this registering operation, the user has only to utter
words corresponding to any registered voice data in
order to ensure voice-based manipulation (whose
details will be given later) for designating the musical
piece, the title thereof, the broadcasting station and so
forth.

[0055] A description will now be given of the operation
in the case where it is determined in step 104 that
the unit registration/search key 9 has been continuously
depressed for 2 or more seconds. When the depression
of this key 9 continues for 2 or more seconds, the mode
is set to the unit designation voice registration mode
and the operation goes to a routine shown in FIG. 12.
[0056] In unit designation voice registration mode,
first, the controller 21 sets "1" in the program counter
constructed by the system program and carries out a
sequence of processes starting at step 300.
[0057] In step 300, as in step 200 in FIG. 11, a guidance
sound of "Register unit name .. Peep" is reproduced,
requesting the user to utter a voice to be
registered.

[0058] In the next step 302, as in step 202, the voice
recognizer 18 initiates a voice recognition process.
When the user utters desired words in response to the
guidance sound, the voice recognizer 18 detects the
beginning of this voice generation, at which point the
program timer in the controller 21 is activated so that the
voice recognizer 18 is controlled to execute voice recognition
of the uttered voice within 2.5 seconds.
[0059] After the end of voice recognition is confirmed
in the next step 304, it is determined in step 306
if voice recognition has been performed properly as
done in step 206. When it is determined that voice recognition
has been done properly, the flow goes to step
308.

[0060] In step 308, the controller 21 detects an
audio unit in operation and stores the detected data and
the voice data generated by the voice recognizer 18 in
the unit designation voice data memory table 19b in
association with each other (in combination).
[0061] Assuming that the audio unit in operation is a
CD player, when the user utters a word "CD" (si:di:) in
step 302, the detected data becomes "cd" and the voice
data has word information of "si:di:". The detected
data and voice data are stored as registered voice data
in the unit designation voice data memory table 19b in
association with each other.

[0062] Assuming that the audio unit in operation is a
radio tuner, as another example, when the user utters a
word "tuner" (t(j)u:ner) in step 302, the detected data
becomes "t(j)u:ner" and the detected data and voice
data are stored as registered voice data in the unit designation
voice data memory table 19b in association
with each other.

[0063] In other words, in unit designation voice reg- 
istration mode, voice data corresponding to the voice 
uttered by the user is registered in the unit designation 
voice data memory table 19b in association with the 
name of the audio unit in operation, as shown in FIG. 
3B. 
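The association described above can be pictured with a minimal sketch. It assumes that table 19b behaves like a simple mapping from registered voice data to a unit name; the function name and the unit identifiers "cd" and "tuner" are hypothetical and chosen only for illustration.

```python
# Hypothetical stand-in for the unit designation voice data memory table 19b:
# each entry pairs registered voice data with the audio unit it designates.
def register_unit_voice(table, active_unit, voice_data):
    """Store voice data in association with the unit currently in operation."""
    table[voice_data] = active_unit
    return table


unit_table = {}
register_unit_voice(unit_table, "cd", "si:di:")        # user says "CD"
register_unit_voice(unit_table, "tuner", "t(j)u:ner")  # user says "tuner"
```

Keying the mapping by voice data makes the later lookup (recognized voice to unit) a single access, which is the direction used in the voice-based manipulation mode.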

[0064] When the registration of voice data is completed,
the flow advances to step 310 where, as in step
210, a guidance sound of "Registered" is output from
the speaker 5, informing the user of the end of the
registration. After the voice registration mode is completed,
the operation goes again to the standby mode and
starts the routine in FIG. 10 again at step 100.
[0065] When it is determined in step 306 that voice 
recognition has not been done properly, the flow moves 
to step 312. In step 312, as in step 212, the controller 21 
checks the value of the program counter to determine if 
the check is the second time. If it is the second time, the 
flow goes to step 314. 

[0066] In step 314, as in step 214, a guidance 
sound of "Beep Beep" is reproduced from the speaker 
5, notifying registration failure. When the voice registra- 
tion mode is ended, the operation comes again to the 
standby mode and starts the routine in FIG. 10 again at 
step 100. That is, if the property of the uttered voice 
cannot be extracted accurately due to the influence of 
noise or the like, the user should perform the registering 
operation from the start. 

[0067] When it is determined in step 312 that the
value of the program counter is "1", the flow goes to step
316. In step 316, as in step 216, it is determined
whether or not the voice registration has taken less than
2.5 seconds. When the voice registration has taken 2.5
seconds or longer, a guidance sound of "Beep .. Too
long" is reproduced from the speaker 5, warning the
user that the time for the voice registration is too long. If
the voice registration mode has not been carried out
properly due to some other factors, a guidance sound of
"Beep .. Try again" is reproduced from the speaker 5,
requesting the user to make voice input again.
[0068] When this notification is completed, "2" is set 
in the program counter and the operation restarts at 
step 300 to allow the user to utter desired words again. 
In other words, the step 316 mainly gives a warning to 
the effect that the way the user utters a voice has not 
been adequate. When the user properly utters intended 
words again in response to this warning, the voice data 
is registered in step 308. Therefore, the user can
register adequate voice data without manipulating the unit
registration/search key 9 again, thus leading to an
improved operability.
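The two-attempt flow of steps 300 to 316 can be sketched as follows. This is an illustrative simulation only: it assumes the program counter starts at "1", and the function name and the tuple format of the input are hypothetical. The guidance strings follow the text.

```python
MAX_DURATION_S = 2.5  # time limit for one utterance, as stated in the text


def run_registration(attempts):
    """Simulate the retry flow.

    attempts: list of (recognized_ok, duration_s) tuples, one per utterance.
    Returns the guidance sounds that would be reproduced, in order.
    """
    sounds = []
    counter = 1  # program counter (assumed to start at 1)
    for recognized_ok, duration_s in attempts:
        sounds.append("Register unit name .. Peep")    # step 300
        if recognized_ok:                               # step 306
            sounds.append("Registered")                 # steps 308-310
            return sounds
        if counter == 2:                                # step 312
            sounds.append("Beep Beep")                  # step 314: give up
            return sounds
        # step 316: first failure, explain why and allow one retry
        if duration_s >= MAX_DURATION_S:
            sounds.append("Beep .. Too long")
        else:
            sounds.append("Beep .. Try again")
        counter = 2
    return sounds
```

For example, an over-long first utterance followed by a good one yields the warning, a second prompt, and then "Registered", without any further key operation.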

[0069] Once the user continuously depresses the
unit registration/search key 9 for 2 or more seconds,
merely uttering words according to a guidance sound
can cause the uttered words to be registered in the unit
designation voice data memory table 19b in association
with the name of the audio unit in operation. After this
registering operation, the user has only to utter words
corresponding to any registered voice data in order to
ensure voice-based manipulation (whose details will be
given later) for designating an audio unit.
[0070] A description will now be given of the operation
in the case where it is determined in step 106 that
the adjusted voice registration/search key 10 has been
continuously depressed for 2 or more seconds. When
the depression of this key 10 continues for 2 or more
seconds, the mode is set to the equalizer adjusted voice
registration mode and the operation goes to a routine
shown in FIG. 13.

[0071] First, the voice synthesizer 20 reproduces a
guidance sound of "Register equalizer mode" in step
400. In the next step 402, the controller 21 restarts the
program counter constructed by the system program to
measure the time for one second. It is determined in
steps 404 and 406 within this one second if the adjusted
voice registration/search key 10 has been depressed for
a short time or any one of the other operation keys 6-9
and 11 has been depressed for a short time.
[0072] When it is the adjusted voice registration/search
key 10 that has been depressed for a short
time, the flow goes to step 408. When it is one of the
other operation keys 6-9 and 11 that has been
depressed for a short time, the flow goes to step 410.
When none of the operation keys 6-11 has been
manipulated within one second, the flow goes to step 420.
[0073] When it is determined in step 406 that one of
the other operation keys 6-9 and 11, rather than the
adjusted voice registration/search key 10, has been
depressed for a short time, the flow goes to step
410, where a process corresponding to the depressed
operation key is performed, and the flow then returns to
step 100 in FIG. 10.
[0074] When it is determined in step 404 that the
adjusted voice registration/search key 10 has been
depressed for a short time, and the flow goes to step
408, the voice synthesizer 20 reproduces a guidance
sound of "Register listening position" after which the
flow moves to step 412. In step 412, the program timer
is restarted to measure the time for one second.
[0075] In steps 414 and 416, it is determined within
this one second whether the adjusted voice
registration/search key 10 or one of the other operation keys
6-9 and 11 has been depressed for a short time. When the
adjusted voice registration/search key 10 has been
depressed for a short time, the flow returns to step 400.
When one of the other operation keys 6-9 and 11 has
been depressed for a short time, a process corresponding
to the depressed key is performed, then the flow
returns to step 100 in FIG. 10.
[0076] In steps 402-418, when the adjusted voice
registration/search key 10 is depressed once for a short 
time, the mode is set to a voice registration mode for 
setting the frequency characteristic of the equalizer as 
an audio unit, and when the second depression of the 
adjusted voice registration/search key 10 is made within 
the first one second, the mode is set to a voice registra- 
tion mode for setting each output level (listening posi- 
tion) of each channel of the stereo speaker, before the 
flow advances to step 420. 

[0077] When a key different from the adjusted voice 
registration/search key 10 but one of the operation keys 
6-9 and 11 is depressed for a short time within the first
one second or within the next one second, a process 
corresponding to the depressed key is carried out. 
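The key handling in the two one-second windows (steps 402 to 416) can be summarized as a small decision function. This is a sketch under assumptions: the function name and return labels are hypothetical, and it follows the step-by-step description, in which no key press in a window lets the flow proceed to step 420 with the mode announced last.

```python
ADJUSTED_KEY = 10  # adjusted voice registration/search key 10


def select_adjusted_mode(first_window_key=None, second_window_key=None):
    """Return the outcome of the two one-second windows.

    Each argument is the number of the key depressed for a short time in
    that window, or None if no key was depressed.
    """
    if first_window_key is None:
        return "equalizer"           # step 402 -> step 420
    if first_window_key != ADJUSTED_KEY:
        return "other_key"           # step 410, then back to step 100
    # Key 10 was pressed: "Register listening position" is announced (step 408)
    if second_window_key is None:
        return "listening_position"  # step 412 -> step 420
    if second_window_key == ADJUSTED_KEY:
        return "restart"             # back to step 400
    return "other_key"               # process that key, back to step 100
```

The "restart" outcome means the user can toggle between the two registration targets by pressing key 10 repeatedly before the window expires.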
[0078] In the subsequent step 420, the voice syn- 
thesizer 20 reproduces a guidance sound of "Peep" to 
inform the user of the beginning of registration. In step 
422, the voice recognizer 18 performs voice recognition 
on the voice uttered by the user in accordance with that 
guidance sound. In this case, accurate voice recognition 
is executed by extracting the uttered voice based on the 
first and second threshold values THD1 and THD2, as 
in the case illustrated in FIGS. 11 and 12.
[0079] Then, it is determined in step 424 if voice 
recognition has been carried out properly. When voice 
recognition has been performed adequately, the flow 
goes to step 426. 

[0080] In step 426, the controller 21 detects the
present setting state of the equalizer via the I/F circuit
22 and the interface port 23 and stores the detected
data and the voice acquired by the voice recognition in
the adjusted voice data memory table 19c in association
with each other (in combination).
[0081] When the operation goes to step 420 from 
step 402, i.e., when the user has instructed the voice 
registration mode for setting the frequency characteris- 
tic of the equalizer and the user has adjusted the equal- 
izer to "super bass" and utters words "super bass" 
(s(j)u:per), then the state of the "super bass" of the 
equalizer and the voice data of "super bass" are stored 
in the adjusted voice data memory table 19c. 
[0082] When the operation goes to step 420 from 
step 412, i.e., when the user has instructed the voice 
registration mode for setting the listening position and 
the user has adjusted the state of the speaker output to
"front right" and utters a word "right" (rait), then the
state of the "front right" and the voice data of "right" are 
stored in the adjusted voice data memory table 19c. 
[0083] Then, a guidance sound of "Registered" is 
reproduced from the speaker 5, notifying the user of the 
end of the registration. After the voice registration mode 
is completed, the operation goes again to the standby 
mode and starts the routine in FIG. 10 again at step 
100. 

[0084] When it is determined in step 424 that voice
recognition has not been done properly, the flow moves
to step 428, but when it is the second time, the flow goes
to step 430, as done in step 212 in FIG. 11.
[0085] In step 430, as in step 214, a guidance
sound of "Beep Beep" is reproduced from the speaker
5, notifying registration failure. When the voice
registration mode is ended, the operation comes again to the
standby mode and starts the routine in FIG. 10 again at
step 100. That is, if the property of the uttered voice
cannot be extracted accurately due to the influence of
noise or the like, the user should perform the registering
operation from the start.

[0086] When it is determined in step 428 that the
value of the program counter is "1", the flow goes to step
432 where, as in step 216, it is determined whether or
not the voice registration has taken less than 2.5
seconds. When the voice registration has taken 2.5
seconds or longer, a guidance sound of "Beep .. Too long"
is reproduced from the speaker 5, warning the user that
the time for the voice registration is too long. If the voice
registration mode has not been carried out properly due
to some other factors, a guidance sound of "Beep .. Try
again" is reproduced from the speaker 5, requesting the
user to make voice input again.

[0087] When this notification is completed, the
operation restarts at step 420 to allow the user to utter
desired words again. Therefore, the user can register
adequate voice data without manipulating the adjusted
voice registration/search key 10 again. This results in an
improvement of the operability.

[0088] Once the user depresses the adjusted voice
registration/search key 10, merely uttering words
according to a guidance sound can cause the uttered
words to be registered in the adjusted voice data memory
table 19c in association with the current adjustment
state of the equalizer. After this registering operation,
the user has only to utter words corresponding to any
registered voice data in order to ensure voice-based
manipulation (whose details will be given later) for
adjusting the equalizer.

[0089] A description will now be given of the operation
in the case where it is determined in step 108 in
FIG. 10 that the volume control/guidance language
switching key 11 has been continuously depressed for 2
or more seconds. When the depression of this key 11
continues for 2 or more seconds, the mode is set to the
language switching mode and the controller 21 switches
the voice guidance data read from the guidance data
memory table 19d or turns off the generation of a
guidance sound, as shown in FIG. 7A. The guidance data
memory table 19d prestores voice guidance data in a
plurality of languages, such as English, German and
French, in addition to voice guidance data in Japanese.
Every time the volume control/guidance language
switching key 11 is depressed for 2 or more seconds,
the controller 21 sequentially switches among the voice
guidance data of each language and the disabling of the
generation of a guidance sound. Therefore, the user can
set the language of guidance voices to a desired
language or turn off guidance voices by operating the
volume control/guidance language switching key 11.
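The sequential switching can be sketched as a cyclic selection. The order of the languages and the position of the "off" setting in the cycle are assumptions made for illustration (the actual order is the one shown in FIG. 7A), and the function name is hypothetical.

```python
# Assumed cycle; None stands for "guidance sound turned off".
GUIDANCE_SETTINGS = ["Japanese", "English", "German", "French", None]


def next_guidance_setting(current):
    """Return the setting selected by one more long press of key 11."""
    position = GUIDANCE_SETTINGS.index(current)
    return GUIDANCE_SETTINGS[(position + 1) % len(GUIDANCE_SETTINGS)]
```

Each long press advances one position, and the modulo operation wraps the selection back to the first language after the "off" setting.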

[0090] A description will now be given of the operation
in the case where it is determined in step 110 in
FIG. 10 that the volume control/guidance language
switching key 11 has been depressed for a short time.
When the depression of this key 11 continues for a short
time, the mode is set to the volume control mode and
the controller 21 sequentially switches the amplification
factor of the speaker amplifier 16 within the range of
three levels of high, middle and low as shown in FIG.
7B. Therefore, the user can adjust the output volume of
the speaker 5 to one of the high volume, middle volume
and low volume by operating the key 11.
[0091] A description will now be given of the operation
in the case where it is determined in step 112 in
FIG. 10 that the normal registration/voice operation key
6 has been depressed for a short time.

[0092] When the normal registration/voice opera- 
tion key 6 is depressed for a short time, the mode is set 
to the voice-based manipulation mode and the opera- 
tion goes to a routine shown in FIG. 14. In FIG. 14, first, 
the controller 21 sets "1" in the program counter and 
performs a sequence of processes starting at step 450. 
[0093] In step 450, the voice synthesizer 20 reads 
predetermined voice guidance data from the guidance 
data memory table 19d and the imitation sound genera- 
tor 17 generates an imitation sound signal of "Peep". 
[0094] The controller 21 sends those guidance 
voice signal and imitation sound signal to the speaker 
amplifier 16 and reproduces a guidance sound of 
"Please make request Peep", which consists of the 
guidance voice and imitation sound, from the speakers, 
thus requesting the user to utter a voice for voice-based 
manipulation. 

[0095] In the next step 452, the voice recognizer 18
starts the voice recognition process. When the user
utters an intended voice (words) corresponding to any
of the voice data that are stored in the title designation
voice data memory table 19a, the unit designation voice
data memory table 19b and the adjusted voice data 
memory table 19c, the voice recognizer 18 detects the 
beginning of the voice generation, at which point the 
program timer in the controller 21 is activated so that the 
voice recognizer 18 is controlled to execute voice recog- 
nition of the uttered voice within 2.5 seconds. In this 
case, accurate voice recognition is carried out by 
extracting the uttered voice based on the first and sec- 
ond threshold values THD1 and THD2, which are higher 
than the level of ambient noise, as in the case of the 
above-described voice registration mode. 
[0096] In the next step 454, it is determined whether 
or not voice recognition has been completed. Then, it is 
determined if voice recognition has been performed 
properly in step 456. This decision is made by checking 
if the level of the uttered voice (voice power) input as a 
recognition target has been higher than the first and 
second threshold values THD1 and THD2. When it is
determined that voice recognition has been done
properly, the flow goes to step 458.
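The decision in step 456 can be written as a one-line check. This sketch encodes only the sentence above, namely that the voice power must exceed both threshold values; the function and parameter names are hypothetical, and the individual roles of THD1 and THD2 in extracting the utterance are as defined for the voice registration mode.

```python
def recognized_properly(voice_power, thd1, thd2):
    """True if the input voice power is higher than both THD1 and THD2."""
    return voice_power > thd1 and voice_power > thd2
```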

[0097] In step 458, the voice synthesizer 20 reads
predetermined voice guidance data from the guidance
data memory table 19d and the controller 21 sends this
guidance voice signal to the speaker amplifier 16 to
output a guidance sound of "OK" from the speaker 5, thus
giving acknowledgement information to the user.
Further, the controller 21 searches the registered voice
data in the title designation voice data memory table
19a based on the voice data acquired through the voice
recognition and acquires information about an audio
unit corresponding to that voice data (the aforementioned
registered, received data). Then, the controller
21 generates a control signal based on the acquired
information, and sends the control signal via the I/F
circuit 22 and the interface port 23 to the audio unit
specified by the user, thereby activating the audio unit. Then,
the voice-based manipulation mode is ended and the
operation comes to the standby mode to start the
routine in FIG. 10 again at step 100.
[0098] If the user utters a word "one" in step 452,
the title designation voice data memory table 19a
shown in FIG. 3A is searched for information of "disc1
track1". Then, the controller 21 controls the CD player
corresponding to this information based on the control
signal to reproduce a musical piece or the like on the
track 1 of the recording/reproducing medium.
[0099] If the user utters a word "seven" in step 452,
the title designation voice data memory table 19a is
searched for information of "band fm1 76.1 MHz". Then,
the controller 21 controls the radio receiver corresponding
to this information based on the control signal to
tune to the broadcasting station of 76.1 MHz.

[0100] If the user utters an intended voice (words)
corresponding to any of the voice data stored in the
unit designation voice data memory table 19b
shown in FIG. 3B and the adjusted voice data memory
table 19c shown in FIG. 3C, it is possible to perform a
voice-based manipulation, such as activation of an
associated audio unit or adjustment of the equalizer.
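The lookup performed in the voice-based manipulation mode can be sketched as a search over the three tables. The table contents below are illustrative entries modeled on the examples in the text, the search order is an assumption, and the function name is hypothetical.

```python
# Illustrative stand-ins for memory tables 19a, 19b and 19c.
title_table = {"one": "disc1 track1", "seven": "band fm1 76.1 MHz"}
unit_table = {"si:di:": "cd", "t(j)u:ner": "tuner"}
adjusted_table = {"super bass": "equalizer super bass", "right": "front right"}


def find_target(voice_data):
    """Return (table name, manipulation target) for recognized voice data.

    Returns None when the voice data is not registered in any table.
    """
    tables = (
        ("title", title_table),
        ("unit", unit_table),
        ("adjusted", adjusted_table),
    )
    for name, table in tables:  # assumed search order
        if voice_data in table:
            return name, table[voice_data]
    return None
```

The returned target is what the controller would translate into a control signal for the corresponding audio unit.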
[0101] When it is determined in step 456 that voice
recognition has not been done properly, the flow moves
to step 460. In step 460, the controller 21 checks the
value of the program counter to determine if the check
is the second time. If it is the second time, the flow goes
to step 462.

[0102] In step 462, the imitation sound generator 17
generates an imitation sound signal of "Beep Beep".
The controller 21 sends this imitation sound signal of
"Beep Beep" to the speaker amplifier 16 and then
outputs a guidance sound of "Beep Beep" from the speaker
5, notifying registration failure. When the voice
registration mode is ended, the operation comes again to the
standby mode and starts the routine in FIG. 10 again at
step 100. In other words, if the property of the uttered
voice cannot be extracted accurately due to the
influence of noise or the like, the user should perform the
registering operation from the start.
[0103] When it is determined in step 460 that the
value of the program counter is "1", the flow goes to step
464. In step 464, the count value of the program counter
is checked to determine whether or not the voice
registration has taken 2.5 seconds or longer.
[0104] When the voice registration has taken 2.5
seconds or longer, the voice synthesizer 20 reads
predetermined voice guidance data from the guidance data
memory table 19d and generates a guidance voice
signal, and the imitation sound generator 17 generates an
imitation sound signal of "Peep". Then, the controller 21
supplies those guidance voice signal and imitation
sound signal to the speaker amplifier 16 and
reproduces "Beep .. Too long" from the speaker 5, warning
the user that the time for the voice registration is too
long.

[0105] If the voice registration mode has not been
carried out properly due to some other factors, the voice
synthesizer 20 reads predetermined voice guidance
data from the guidance data memory table 19d and
generates a guidance voice signal, and the imitation
sound generator 17 generates an imitation sound signal
of "Peep". Then, the controller 21 supplies those
guidance voice signal and imitation sound signal to the
speaker amplifier 16 and reproduces "Beep .. Try again"
from the speaker 5, requesting the user to make voice
input again.

[0106] When this notification in step 464 is completed,
"2" is set in the program counter and the operation
restarts at step 450 to allow the user to utter desired
words again. In other words, when the way the user
utters a voice has not been adequate, the user can
perform a voice-based manipulation merely by uttering
proper words, without manipulating the normal
registration/voice operation key 6 again, as done in the voice
registration mode.

[0107] Once the user depresses the
normal registration/voice operation key 6 for a short
time, the user can manipulate a desired audio unit
simply by uttering a voice (words) registered in any of the
voice data memory tables 19a-19c in accordance with a
guidance sound.

[0108] A description will now be given of the operation
in the case where it is determined in step 114 in
FIG. 10 that the search/forward scan key 7 or the
search/reverse scan key 8 has been depressed for a
short time. When a short depression of the key 7 or 8
occurs, the mode is set to the registered voice data
search mode and the operation goes to a routine shown
in FIGs. 15A and 15B.

[0109] In step 500, the controller 21 searches the
title designation voice data memory table 19a to
determine if there is registered voice data. When there is no
registered voice data ("NO"), a guidance sound of "No
voice registered" is given and then, the flow returns to
step 100 in FIG. 10.
[0110] When there is registered voice data in step
500 ("YES"), however, the flow goes to step 502 to
check a currently active audio unit and determine if
registered voice data associated with that audio unit is
present in the title designation voice data memory table
19a shown in FIG. 3A. When the currently active audio
unit is the radio tuner which is receiving radio waves
from the broadcasting station of 81.1 MHz, for example,
it is determined whether or not there is registered voice
data corresponding to the broadcasting station of 81.1
MHz.

[0111] Assuming that there is voice data of a word
"eight" (eit) corresponding to the broadcasting station of 
81.1 MHz as shown in FIG. 3A, then, the voice synthe- 
sizer 20 reads the voice data of "eight" and performs 
voice synthesizing and outputs a synthesized voice of 
"eight" from the speaker 5. 

[0112] If there is no registered voice data associ- 
ated with a currently active audio unit in step 502 
("NO"), the flow goes to step 506. 
[0113] In the case where the search/forward scan 
key 7 has been depressed for a short time, the voice 
data associated with an active audio unit registered in 
the title designation voice data memory table 19a is 
read in the forward sort direction and is converted into 
synthesized voices, which are output from the speaker 5 
one after another in step 506. In the case where the 
search/reverse scan key 8 has been depressed for a 
short time, the registered voice data associated with an 
active audio unit is read in the reverse sort direction and 
is converted into synthesized voices, which are output 
from the speaker 5 one after another. 
[0114] Accordingly, the user can confirm voice data
registered in the title designation voice data memory 
table 19a and can recheck the voice data even if the 
user has forgotten it. 

[0115] In the next step 508, the controller 21 measures
the time of 8 seconds by means of the program
timer. In steps 510-518, the controller 21 determines if
any of the other operational button switches 6-11 has
been depressed for a short time within 8 seconds.
When such a short key depression is detected, the
controller 21 performs a process corresponding to the
depressed key and then returns to step 100 in FIG. 10.
When none of the operational button switches 6-11 has
been depressed for a short time even after passing of 8
seconds, the operation directly returns to step 100 in
FIG. 10 from step 508.

[0116] With the search/reverse scan key 8 
depressed for a short time, when the search/forward 
scan key 7 is depressed for a short time in step 510, the 
flow goes to step 520. In step 520, voice data stored at 
a memory address apart by one address from the 
address of the last voice data produced as a synthe- 
sized voice in the forward sort direction is read out and 
is produced as a synthesized voice. Then, the flow 
returns to step 508. 

[0117] With the search/forward scan key 7 
depressed for a short time, when the search/reverse 
scan key 8 is depressed for a short time in step 512, the 
flow goes to step 522. In step 522, voice data stored at 
a memory address apart by one address from the
address of the last voice data produced as a synthesized
voice in the reverse sort direction is read out and
is produced as a synthesized voice. Then, the flow
returns to step 508.
[0118] That is, the order of presenting voice data 
registered in the title designation voice data memory 
table 19a is switched from one to the other in steps 520 
and 522. 

[0119] When the unit registration/search key 9 is
depressed for a short time in step 514, the flow goes to
step 524. In step 524, the unit designation voice data
memory table 19b shown in FIG. 3B is searched to
check if there is voice data corresponding to a currently
active audio unit. If there is such voice data, this voice
data is produced as a synthesized voice. When the
currently active audio unit is the radio tuner, for example, a
synthesized sound of a word "tuner" (t(j)u:ner) is
produced. Then, the flow returns to step 508. When there
is no corresponding voice data, the top voice data in the
unit designation voice data memory table 19b is read
out and then the flow returns to step 508.
[0120] When the adjusted voice registration/search
key 10 is depressed for a short time in step 516, the flow
goes to step 526. In step 526, the adjusted voice data
memory table 19c shown in FIG. 3C is searched to
check if there is registered voice data associated with
the equalizer. If there is such voice data, this voice data
is produced as a synthesized voice. Then, the flow
returns to step 508. When there is no corresponding
voice data, the top voice data in the adjusted voice data
memory table 19c is read out and then the flow returns
to step 508.
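The search in steps 524 and 526 follows one pattern: present the voice data that matches the current state, and otherwise fall back to the top entry of the table. A minimal sketch, assuming each table is an ordered list of (voice data, target) pairs; the function name and the sample entries are hypothetical.

```python
def search_or_top(table, current_target):
    """Return the voice data registered for current_target.

    Falls back to the top (first) entry when there is no match, and
    returns None for an empty table.
    """
    for voice_data, target in table:
        if target == current_target:
            return voice_data
    return table[0][0] if table else None


# Illustrative contents of the unit designation table 19b.
unit_entries = [("si:di:", "cd"), ("t(j)u:ner", "tuner")]
```

The returned voice data is what the voice synthesizer would reproduce for the user before the flow returns to step 508.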

[0121] When any of the other keys, 6 or 10, is
depressed in step 518, the flow goes to step 528 to
perform a process corresponding to each depressed key 6
or 10. Then, the flow moves to step 508.
[0122] Because the user can confirm voice data
registered in the title designation voice data memory
table 19a, the unit designation voice data memory table
19b and the adjusted voice data memory table 19c by
depressing any of the operational button switches 7, 8,
9 and 10 to set the registered voice data search mode,
as apparent from the above, the user can check voice
data again even if he or she has forgotten it.
[0123] A description will now be given of the operation
in the case where it is determined in step 116 in
FIG. 10 that the search/forward scan key 7 or the
search/reverse scan key 8 has been continuously
depressed for 2 or more seconds. When the depression
of this key 7 or 8 continues for 2 or more seconds, the
mode is set to the registered voice data scan mode and
the processes illustrated in FIG. 8B or FIG. 9B are
performed. When the search/forward scan key 7 has been
continuously depressed for 2 or more seconds, voice
data already registered in the title designation voice
data memory table 19a shown in FIG. 3A is read
(scanned) in the forward sort direction and is
sequentially produced as synthesized voices. If the normal
registration/voice operation key 6 is depressed during the
action, an audio unit corresponding to the last searched
or scanned voice data is controlled based on this voice
data.

[0124] When the search/reverse scan key 8 has 
been continuously depressed for 2 or more seconds, 
voice data already registered in the title designation 
voice data memory table 19a shown in FIG. 3A is read 
(scanned) in the reverse sort direction and is sequen- 
tially produced as synthesized voices. If the normal reg- 
istration/voice operation key 6 is depressed during the 
action, the currently active audio unit corresponding to 
the last searched or scanned voice data is controlled 
based on this voice data. 

[0125] A description will now be given of the opera- 
tion in the case where it is determined in step 118 in 
FIG. 10 that the unit registration/search key 9 has been 
continuously depressed for 2 or more seconds. When 
the depression of this key 9 continues for 2 or more sec- 
onds, the mode is set to the unit designation voice data 
search mode and the process shown in FIG. 5B is exe- 
cuted. Specifically, voice data associated with the name 
of the currently active audio unit, which is already regis- 
tered in the unit designation voice data memory table 
19b, is produced as a synthesized voice. When voice 
data associated with the name of the currently active 
audio unit is not registered, the mode is switched to the 
unit designation voice data scan mode for sequentially 
producing voice data associated with the names of 
other audio units as synthesized voices. When the unit 
registration/search key 9 is depressed again during the 
unit designation voice data scan mode, the mode is 
switched to the one that produces the voice data, which 
is associated with the name of the currently active audio 
unit and is already registered in the unit designation 
voice data memory table 19b, as a synthesized voice. 
When the normal registration/voice operation key 6 is 
depressed during the unit designation voice data search 
mode or the unit designation voice data scan mode, the 
currently active audio unit corresponding to the last 
searched or scanned voice data is controlled based on 
this voice data. 

[0126] A description will now be given of the opera- 
tion in the case where it is determined in step 120 in 
FIG. 10 that the adjusted voice registration/search key 
10 has been depressed for a short time.
When a short depression of this key 10 occurs, the 
mode is set to the adjusted voice data search mode and 
the processes shown in FIG. 6C are executed. Specifi- 
cally, voice data which is associated with the currently 
set positioning state or the current frequency character- 
istic of the equalizer and is registered in the adjusted 
voice data memory table 19c shown in FIG. 3C is pro- 
duced as a synthesized voice. When the adjusted voice 
registration/search key 10 is depressed during the 
adjusted voice data search mode, all the pieces of voice
data registered in the adjusted voice data memory table
19c are scanned and are sequentially produced as
synthesized voices. If the normal registration/voice
operation key 6 is depressed during the action, the currently
active audio unit corresponding to the last searched or 
scanned voice data is controlled based on this voice 
data. 

[0127] To perform a voice-based manipulation, as 
apparent from the foregoing description, the voice- 
based manipulation apparatus of this embodiment 
searches or scans the voice data registered in the title 
designation voice data memory table 19a, the unit
designation voice data memory table 19b and the adjusted
voice data memory table 19c and produces the 
searched or scanned voice data as a synthesized voice. 
Even if the user does not remember registered voice 
data, the user can easily check the correlation between 
registered voices and their corresponding manipulation 
targets. Unlike the prior art, therefore, it is unnecessary 
to register voice data again from the beginning, thus 
demonstrating an excellent operability. 
[0128] As a plurality of operational functions are
assigned to each of the operational button switches
6-11, it is possible to reduce the number of required
operational button switches, which can contribute to
making the remote operation section 4 more compact.
[0129] Although the foregoing description has dealt with
an embodiment designed to perform voice-based manipulation
of an audio system, this invention is not limited to a
voice-based manipulation apparatus for audio systems. For
example, this invention may be adapted to an on-board
unit for a vehicle that is equipped with an air-conditioning
system in addition to an on-board audio system, so
that both the audio system and the air-conditioning system
can be manipulated by voice. Further, this invention
may be adapted to manipulate various other manipulation
targets by voice.

[0130] In short, the voice-based manipulation appa- 
ratus according to this invention has a search section 
which searches voice information stored in a storage 
section in association with a manipulation target and 
produces the searched voice information. Even if a user 
has forgotten any registered voice information, for 
example, this apparatus can provide the user with the 
correlation between the voice information and the associated
manipulation target. This makes it unnecessary
for the user to store the voice information again in the
storage section because of a lapse of memory or the like,
and provides the user with excellent operability.
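
The relationship between the storage section, the manipulation section and the search section can be sketched as follows. This is a minimal sketch under stated assumptions, not the apparatus itself: the class and method names are hypothetical, text strings stand in for voice information, and the fallback to all other stored voice information and the forward or reverse ordering are modeled loosely on the options recited in the claims.

```python
from typing import Dict, List, Optional


class VoiceManipulationApparatus:
    """Hypothetical model of the storage, manipulation and search sections."""

    def __init__(self) -> None:
        # Storage section: voice information -> manipulation target.
        self._storage: Dict[str, str] = {}

    def store(self, voice_info: str, target: str) -> None:
        """Register voice information; storing again overwrites the entry."""
        self._storage[voice_info] = target

    def manipulate(self, voice: str) -> Optional[str]:
        """Manipulation section: target for the supplied voice, if registered."""
        return self._storage.get(voice)

    def search(self, active_target: str, reverse: bool = False) -> List[str]:
        """Search section: voice information to present for the active target.

        If nothing is registered for the active target, all other stored
        voice information is returned instead, in the chosen sort direction.
        """
        ordered = sorted(self._storage, reverse=reverse)
        matches = [v for v in ordered if self._storage[v] == active_target]
        return matches if matches else ordered


apparatus = VoiceManipulationApparatus()
apparatus.store("jazz disc", "CD player: disc 2")
apparatus.store("news", "tuner: preset 1")
forgotten = apparatus.search("tuner: preset 1")  # voice information to present
```

A user who has forgotten what was registered for the tuner preset would thus be presented with "news" rather than having to register a new voice.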

Claims 

1. A voice-based manipulation apparatus comprising:

a storage section for storing voice information
for specifying manipulation targets in association
with said manipulation targets;

a manipulation section for, when a voice is supplied,
manipulating that of said manipulation
targets which is associated with that of said
voice information stored in said storage section
which corresponds to said voice; and

a search section for searching said voice information
stored in said storage section in association
with said manipulation target and
presenting resultant voice information.

2. The voice-based manipulation apparatus according
to claim 1, wherein in response to a search instruction
externally supplied, said search section detects
an active manipulation target, searches for that
voice information which is associated with said
detected active manipulation target and presents
said searched voice information.

3. The voice-based manipulation apparatus according
to claim 2, wherein when voice information associated
with said active manipulation target is not
stored in said storage section, said search section
searches other voice information stored in said
storage section in association with said manipulation
target and presents said searched voice information.

4. The voice-based manipulation apparatus according
to claim 2 or 3, wherein in response to said search
instruction externally supplied, said search section
searches said voice information stored in said storage
section in a predetermined order in association
with said manipulation target and presents said
searched voice information.

5. The voice-based manipulation apparatus according 
to claim 4, wherein said predetermined order is an 
alphabetical order. 

6. The voice-based manipulation apparatus according
to claim 4, wherein said predetermined order is a 
forward sort direction. 

7. The voice-based manipulation apparatus according
to claim 4, wherein said predetermined order is a
reverse sort direction.

8. The voice-based manipulation apparatus according
to any one of claims 1 to 7, wherein said storage
section can store said voice information again and
stores a supplied voice as voice information associated
with an active manipulation target at a time of
storing said voice information again.

9. A voice-based manipulation method comprising the
steps of:

storing voice information for specifying manipulation
targets in a storage section in association
with said manipulation targets;

manipulating, when a voice is supplied, that of
said manipulation targets which is associated
with that of said voice information stored in said
storage section which corresponds to said
voice; and

searching said voice information stored in said
storage section in association with said manipulation
target and presenting resultant voice
information.

10. The voice-based manipulation method according to
claim 9, wherein in response to a search instruction
externally supplied, said searching step detects an
active manipulation target, searches for that voice
information which is associated with said detected
active manipulation target and presents said
searched voice information.

11. The voice-based manipulation method according to
claim 10, wherein when voice information associated
with said active manipulation target is not
stored in said storage section, said searching step
searches other voice information stored in said
storage section in association with said manipulation
target and presents said searched voice information.

12. The voice-based manipulation method according to
claim 10 or 11, wherein in response to said search
instruction externally supplied, said searching step
searches said voice information stored in said storage
section in a predetermined order in association
with said manipulation target and presents said
searched voice information.

13. The voice-based manipulation method according to 
claim 12, wherein said predetermined order is an 
alphabetical order.

14. The voice-based manipulation method according to 
claim 12, wherein said predetermined order is a for- 
ward sort direction. 


15. The voice-based manipulation method according to 
claim 12, wherein said predetermined order is a 
reverse sort direction. 

16. The voice-based manipulation method according to
any one of claims 9 to 15, wherein said storage section
can store said voice information again and
stores a supplied voice as voice information associated
with an active manipulation target at a time of
storing said voice information again.